Preparations

Load the necessary libraries

library(car)       #for regression diagnostics
library(broom)     #for tidy output
library(ggfortify) #for model diagnostics
library(sjPlot)    #for outputs
library(knitr)     #for kable
library(effects)   #for partial effects plots
library(ggeffects) #for partial effects plots
library(emmeans)   #for estimating marginal means
library(MASS)      #for glm.nb
library(MuMIn)     #for AICc
library(tidyverse) #for data wrangling
library(nlme)
library(lme4)
library(glmmTMB)
library(broom.mixed)
library(glmmTMB)   #for glmmTMB
library(DHARMa)   #for residuals and diagnostics
library(performance) #for diagnostic plots
library(see)         #for diagnostic plots
theme_set(theme_classic())

Scenario

To investigate differential metabolic plasticity in barramundi (Lates calcarifer), Norin, Malte, and Clark (2015) exposed juvenile barramundi to various environmental changes (increased temperature, decreased salinity and increased hypoxia) as well as control conditions. Metabolic plasticity was calculated as the percentage difference in standard metabolic rate between the various treatment conditions and the standard metabolic rate under control conditions. They were interested in whether there was a relationship between metabolic plasticity and typical (control) metabolism and how the different treatment conditions impact on this relationship.

A total of 60 barramundi juveniles were subject to each of the three conditions (high temperature, low salinity and hypoxia) in addition to control conditions. Fish mass was also recorded as a covariate as this is known to influence metabolic parameters.

Barramundi

Sampling design

Format of norin.csv data files

fishid mass trial smr_contr change
1 35.69 LowSalinity 5.85 -31.92
2 33.84 LowSalinity 6.53 2.52
3 37.78 LowSalinity 5.66 -6.28
.. .. .. .. ..
1 36.80 HighTemperature 5.85 18.32
2 34.98 HighTemperature 6.53 19.06
3 38.38 HighTemperature 5.66 19.03
.. .. .. .. ..
1 45.06 Hypoxia 5.85 -18.61
2 43.51 Hypoxia 6.53 -5.37
3 45.11 Hypoxia 5.66 -13.95
fishid Categorical listing of the individual fish that are repeatedly sampled
mass Mass (g) of barramundi. Covariate in analysis
trial Categorical listing of the trial (LowSalinity: 10ppt salinity; HighTemperature: 35 degrees; Hypoxia: 45% air-sat. oxygen.
smr_contr Standard metabolic rate (mg/h/39.4 g of fish) under control trial conditions (35 ppt salinity, 29 degrees, normoxia)
change Percentage difference in Standard metabolic rate (mg/h/39.4 g of fish) between Trial conditions and control adjusted for 'regression to the mean'.

Read in the data

norin <- read_csv('../data/norin.csv', trim_ws=TRUE) %>%
  janitor::clean_names() %>%
  mutate(fishid = factor(fishid), trial = factor(trial))
glimpse(norin)
## Rows: 180
## Columns: 5
## $ fishid    <fct> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, …
## $ mass      <dbl> 35.69, 33.84, 37.78, 26.58, 37.62, 37.68, 30.62, 50.37, 24.…
## $ trial     <fct> LowSalinity, LowSalinity, LowSalinity, LowSalinity, LowSali…
## $ smr_contr <dbl> 5.847466, 6.530707, 5.659556, 6.278200, 4.407336, 4.818589,…
## $ change    <dbl> -31.919389, 2.520929, -6.284968, -4.346675, -3.071329, -15.…

Exploratory data analysis

norin %>%
  ggplot(aes(x = trial, y = change)) +
  geom_boxplot()

norin %>% 
  ggplot(aes(x = smr_contr, y = change, shape = trial, color = trial)) +
  geom_smooth(method = "lm") +
  geom_point()

We can see that smr_contr + trial is similar to an ANCOVA design, as we need to see if the lines are parallel or if they are not, indicating an interaction.

norin %>% 
  ggplot(aes(x = mass, y = change, shape = trial, color = trial)) +
  geom_smooth(method = "lm") +
  geom_point()

We can see that mass doesn’t seem to affect change much.

We can add in mass as an offset - doesn’t cost any df, but does soak up the additional variance associated. HOWEVER, it assumes the effect of mass is a 1:1 effect on response. Obviously this is not the case here.

Density is the classic example for using offset. Instead of counts, we can put it in as an offset.

Model formula: \[ y_i \sim{} \mathcal{N}(\mu_i, \sigma^2)\\ \mu_i =\boldsymbol{\beta} \bf{X_i} + \boldsymbol{\gamma} \bf{Z_i} \]

where \(\boldsymbol{\beta}\) and \(\boldsymbol{\gamma}\) are vectors of the fixed and random effects parameters respectively and \(\bf{X}\) is the model matrix representing the overall intercept and effects of temperature and (centered) mean fish size on SDA peak. \(\bf{Z}\) represents a cell means model matrix for the random intercepts associated with individual fish.

Fit the model

Model validation

norin_lmm4 %>% simulateResiduals(plot=T)

## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help. 
##  
## Scaled residual values: 0.14 0.548 0.38 0.476 0.312 0.196 0.288 0.872 0.616 0.644 0.772 0.66 0.54 0.74 0.204 0.288 0.032 0.148 0.02 0.224 ...

Literally near perfect.

Partial plots model

Using sjPlot::plot_model

norin_lmm4 %>% plot_model(type = 'eff', show.data = T) %>% plot_grid() # Misleading due to the assumption that there is no interactions

norin_lmm4 %>% plot_model(type = 'eff', terms = c("smr_contr", "trial"), 
                          show.data = T) # put both plots together

Using emmeans:

norin_lmm4 %>% ggemmeans(~smr_contr|trial) %>% plot(add.data=T) # assumes no interaction

Note: that ggemmeans seems to cause errors with the x-axis coordinates of the points! This seems to be an error!

Model investigation / hypothesis testing

Using REML fit for interpretation.

summary(norin_lmm4)
##  Family: gaussian  ( identity )
## Formula:          
## change ~ trial * scale(smr_contr, scale = FALSE) + (trial | fishid)
## Data: norin
## 
##      AIC      BIC   logLik deviance df.resid 
##   1594.0   1635.5   -784.0   1568.0      173 
## 
## Random effects:
## 
## Conditional model:
##  Groups   Name             Variance Std.Dev. Corr        
##  fishid   (Intercept)       333.013 18.249               
##           trialHypoxia      355.899 18.865   -0.62       
##           trialLowSalinity 1051.420 32.426   -0.21 -0.05 
##  Residual                     8.452  2.907               
## Number of obs: 180, groups:  fishid, 60
## 
## Dispersion estimate for gaussian family (sigma^2): 8.45 
## 
## Conditional model:
##                                                  Estimate Std. Error z value
## (Intercept)                                        50.382      2.386  21.119
## trialHypoxia                                      -50.226      2.493 -20.150
## trialLowSalinity                                  -42.223      4.220 -10.006
## scale(smr_contr, scale = FALSE)                   -21.553      3.821  -5.641
## trialHypoxia:scale(smr_contr, scale = FALSE)       13.511      3.992   3.384
## trialLowSalinity:scale(smr_contr, scale = FALSE)   10.189      6.758   1.508
##                                                  Pr(>|z|)    
## (Intercept)                                       < 2e-16 ***
## trialHypoxia                                      < 2e-16 ***
## trialLowSalinity                                  < 2e-16 ***
## scale(smr_contr, scale = FALSE)                  1.69e-08 ***
## trialHypoxia:scale(smr_contr, scale = FALSE)     0.000714 ***
## trialLowSalinity:scale(smr_contr, scale = FALSE) 0.131648    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Evidence of slope x intercept correlation for high salin vs. hypoxia and high salin (-0.62) vs. low salin (r = -0.21), but not super important for model interpretation.

Very high correlations between variables indicates that bizarre stuff is happening…

cov2cor(vcov(norin_lmm4)$cond)
##                                                    (Intercept)  trialHypoxia
## (Intercept)                                       1.000000e+00 -6.203202e-01
## trialHypoxia                                     -6.203202e-01  1.000000e+00
## trialLowSalinity                                 -2.150696e-01 -3.513196e-02
## scale(smr_contr, scale = FALSE)                  -3.745067e-15  3.350838e-15
## trialHypoxia:scale(smr_contr, scale = FALSE)      1.768497e-15 -1.991780e-15
## trialLowSalinity:scale(smr_contr, scale = FALSE)  1.990724e-15 -1.511568e-15
##                                                  trialLowSalinity
## (Intercept)                                         -2.150696e-01
## trialHypoxia                                        -3.513196e-02
## trialLowSalinity                                     1.000000e+00
## scale(smr_contr, scale = FALSE)                     -4.128113e-15
## trialHypoxia:scale(smr_contr, scale = FALSE)         2.613165e-15
## trialLowSalinity:scale(smr_contr, scale = FALSE)     8.487862e-16
##                                                  scale(smr_contr, scale = FALSE)
## (Intercept)                                                        -4.634123e-15
## trialHypoxia                                                        2.549910e-15
## trialLowSalinity                                                    5.469717e-16
## scale(smr_contr, scale = FALSE)                                     1.000000e+00
## trialHypoxia:scale(smr_contr, scale = FALSE)                       -6.203202e-01
## trialLowSalinity:scale(smr_contr, scale = FALSE)                   -2.150696e-01
##                                                  trialHypoxia:scale(smr_contr, scale = FALSE)
## (Intercept)                                                                      3.524682e-15
## trialHypoxia                                                                    -2.260970e-15
## trialLowSalinity                                                                -6.714704e-16
## scale(smr_contr, scale = FALSE)                                                 -6.203202e-01
## trialHypoxia:scale(smr_contr, scale = FALSE)                                     1.000000e+00
## trialLowSalinity:scale(smr_contr, scale = FALSE)                                -3.513196e-02
##                                                  trialLowSalinity:scale(smr_contr, scale = FALSE)
## (Intercept)                                                                         -3.082333e-15
## trialHypoxia                                                                         1.418294e-15
## trialLowSalinity                                                                     2.128408e-15
## scale(smr_contr, scale = FALSE)                                                     -2.150696e-01
## trialHypoxia:scale(smr_contr, scale = FALSE)                                        -3.513196e-02
## trialLowSalinity:scale(smr_contr, scale = FALSE)                                     1.000000e+00

For random effect terms: x+(x|g) = 1+x+(1+x|g) –> Correlated random intercept and slope terms x+(x||g) = 1+x+(1|g)+(0+x|g) –> Uncorrelated random intercept and slope terms

Predictions / further analyses

summary(norin_lmm4)
##  Family: gaussian  ( identity )
## Formula:          
## change ~ trial * scale(smr_contr, scale = FALSE) + (trial | fishid)
## Data: norin
## 
##      AIC      BIC   logLik deviance df.resid 
##   1594.0   1635.5   -784.0   1568.0      173 
## 
## Random effects:
## 
## Conditional model:
##  Groups   Name             Variance Std.Dev. Corr        
##  fishid   (Intercept)       333.013 18.249               
##           trialHypoxia      355.899 18.865   -0.62       
##           trialLowSalinity 1051.420 32.426   -0.21 -0.05 
##  Residual                     8.452  2.907               
## Number of obs: 180, groups:  fishid, 60
## 
## Dispersion estimate for gaussian family (sigma^2): 8.45 
## 
## Conditional model:
##                                                  Estimate Std. Error z value
## (Intercept)                                        50.382      2.386  21.119
## trialHypoxia                                      -50.226      2.493 -20.150
## trialLowSalinity                                  -42.223      4.220 -10.006
## scale(smr_contr, scale = FALSE)                   -21.553      3.821  -5.641
## trialHypoxia:scale(smr_contr, scale = FALSE)       13.511      3.992   3.384
## trialLowSalinity:scale(smr_contr, scale = FALSE)   10.189      6.758   1.508
##                                                  Pr(>|z|)    
## (Intercept)                                       < 2e-16 ***
## trialHypoxia                                      < 2e-16 ***
## trialLowSalinity                                  < 2e-16 ***
## scale(smr_contr, scale = FALSE)                  1.69e-08 ***
## trialHypoxia:scale(smr_contr, scale = FALSE)     0.000714 ***
## trialLowSalinity:scale(smr_contr, scale = FALSE) 0.131648    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
tidy(norin_lmm4, conf.int=T, effect='fixed')

Significant slope between hypoxia and physiology, not significant between low salinity and physiology, but missing comparison of high salinity and physiology! Use emtrends to do this.

emtrends(norin_lmm4, pairwise ~ trial, var = "smr_contr")
## $emtrends
##  trial           smr_contr.trend   SE  df lower.CL upper.CL
##  HighTemperature          -21.55 3.82 173    -29.1   -14.01
##  Hypoxia                   -8.04 3.41 173    -14.8    -1.32
##  LowSalinity              -11.36 7.01 173    -25.2     2.48
## 
## Confidence level used: 0.95 
## 
## $contrasts
##  contrast                      estimate   SE  df t.ratio p.value
##  HighTemperature - Hypoxia       -13.51 3.99 173 -3.384  0.0025 
##  HighTemperature - LowSalinity   -10.19 6.76 173 -1.508  0.2898 
##  Hypoxia - LowSalinity             3.32 7.97 173  0.417  0.9087 
## 
## P value adjustment: tukey method for comparing a family of 3 estimates

Hypoxia vs. high temp significant, everything else is not significant.

norin_lmm4 %>% 
  emmeans(~ trial) %>% # get differences
  regrid() %>%  # does backtransform of the differences BEFORE pair-wise comparisons
  pairs() %>% # get pair-wise absolute values of differences
  confint() # get confidence interval limits for these differences

Planned contrasts

We wish to compare the pair-wise differences among these trials at specific values of physiology (smr_contr). To do so, we can create a prediction grid, then use emmeans:

norin_grid <- with(norin, 
                   list(smr_contr = c(min(smr_contr), 
                                      mean(smr_contr),
                                      max(smr_contr))))
emmeans(norin_lmm1, pairwise ~ trial|smr_contr, at = norin_grid)
## $emmeans
## smr_contr = 3.98:
##  trial           emmean   SE  df lower.CL upper.CL
##  HighTemperature  75.99 6.93 172    62.32    89.66
##  Hypoxia           5.81 6.93 172    -7.86    19.49
##  LowSalinity      20.93 6.93 172     7.26    34.61
## 
## smr_contr = 5.18:
##  trial           emmean   SE  df lower.CL upper.CL
##  HighTemperature  52.29 3.21 172    45.96    58.62
##  Hypoxia          -3.83 3.21 172   -10.16     2.51
##  LowSalinity      10.23 3.21 172     3.90    16.57
## 
## smr_contr = 6.70:
##  trial           emmean   SE  df lower.CL upper.CL
##  HighTemperature  22.10 8.45 172     5.41    38.79
##  Hypoxia         -16.11 8.45 172   -32.79     0.58
##  LowSalinity      -3.39 8.45 172   -20.08    13.29
## 
## Confidence level used: 0.95 
## 
## $contrasts
## smr_contr = 3.98:
##  contrast                      estimate    SE  df t.ratio p.value
##  HighTemperature - Hypoxia         70.2  8.47 172  8.289  <.0001 
##  HighTemperature - LowSalinity     55.1  8.47 172  6.503  <.0001 
##  Hypoxia - LowSalinity            -15.1  8.47 172 -1.786  0.1773 
## 
## smr_contr = 5.18:
##  contrast                      estimate    SE  df t.ratio p.value
##  HighTemperature - Hypoxia         56.1  3.92 172 14.309  <.0001 
##  HighTemperature - LowSalinity     42.1  3.92 172 10.724  <.0001 
##  Hypoxia - LowSalinity            -14.1  3.92 172 -3.586  0.0013 
## 
## smr_contr = 6.70:
##  contrast                      estimate    SE  df t.ratio p.value
##  HighTemperature - Hypoxia         38.2 10.33 172  3.698  0.0008 
##  HighTemperature - LowSalinity     25.5 10.33 172  2.468  0.0386 
##  Hypoxia - LowSalinity            -12.7 10.33 172 -1.231  0.4368 
## 
## P value adjustment: tukey method for comparing a family of 3 estimates

Shows the differences at each specific level

r.squaredGLMM(norin_lmm4)
##            R2m       R2c
## [1,] 0.4942093 0.9927262
performance::r2_nakagawa(norin_lmm4)
## # R2 for Mixed Models
## 
##   Conditional R2: 0.993
##      Marginal R2: 0.494

Summary figures

norin_grid <- with(norin, 
                   list(smr_contr = modelr::seq_range(smr_contr, n=100)))
newdata = emmeans(norin_lmm4, ~ smr_contr|trial, at = norin_grid) %>%
  as.data.frame() %>%
  rename(change = emmean, lwr = lower.CL, upr = upper.CL)

(g1 <- newdata %>%
  ggplot(aes(x = smr_contr, y = change)) +
  geom_ribbon(aes(ymin = lwr, ymax = upr, fill = trial), alpha = 0.3) +
  geom_line(aes(color = trial)) +
  theme(legend.position = c(0.99, 0.99),
        legend.justification = c(1,1)))

  # geom_point(data = norin, aes(col = trial)) # apparently is bad?

However, we can’t add the plot the same way, as the model data values are standardized in a specific way according to the random effect

Thus, we have to add data a different way… Step 1. Get predicted value for each individual via augment Step 2. Add on the x variables lost via augment Step 3. Add on residual value to obtain standardized values

obs <- augment(norin_lmm4) %>% 
  bind_cols(dplyr::select(norin, smr_contr)) %>%
  mutate(partial_obs = .fitted + .resid)

g1 + 
  # geom_point(data = norin, aes(col = trial)) # looks the same, but in models where the other variables not being plotted explain a lot of the variation, will be drastically different!
  geom_point(data = obs, aes(y = partial_obs, color = trial))

References

Norin, Tommy, Hans Malte, and Timothy D. Clark. 2015. “Differential Plasticity of Metabolic Rate Phenotypes in a Tropical Fish Facing Environmental Change.” Functional Ecology 30 (3): 369–78. https://doi.org/10.1111/1365-2435.12503.